Summary

This is the final report for FY2013 for the NASA Electronic Parts and Packaging (NEPP) program 
Double Data Rate (DDR) class 2 (DDR2) device reliability task. This task is focused on developing
methods to improve the reliability of DDR2 and DDR3 devices that may be used for space missions. The effort is based on
identification of reasonable candidate devices and development of screening methods to ensure that 
compromised or lower-reliability devices are not used in space. 

High-speed memory devices are needed for flight data and computing applications, and more missions are
turning to DDR-class devices such as DDR2 and DDR3. Recent flight project incidents involving
earlier-generation synchronous dynamic random access memory (SDRAM), the functional precursor to
DDR-class devices, have shown significant unexpected reliability anomalies.
Some of these anomalies are from a subset of DDR devices that can be excluded from flight based on
reliability screening. This task seeks to identify the most appropriate methods to identify and
remove reduced-reliability devices.

This task follows from the FY12 task in the continuation of DDR2 screening. The plan coming forward 
from FY12 includes identification of outlier devices using standard device parametric measurements 
followed by a detailed evaluation of the DDR2 device’s ability to faithfully store data. The goal was to 
attempt to separate devices into two groups: the first group would be the main subset of similarly 
behaving devices, while the second group was the reduced reliability group. The latter group, as identified
in this year’s work, clearly shows undesirable features that should preclude their use in flight projects.
Thus we did not carry out accelerated wear-out testing since the devices were already compromised. Upon
completion of the current campaign of DDR2 testing, this task will migrate to similar study of DDR3 
devices. 

Testing focused on 144 Hynix devices, which were evaluated against multiple reliability tests. A more
limited set of tests was carried out on 144 Micron and 144 Samsung DDR2 devices. We obtained
nominal operating currents, in accordance with the standard datasheet measurements of supply current 
flowing to the Vdd pin (IDD). On the Hynix devices, we obtained data retention properties against nine 
different data patterns. The other devices are scheduled for similar testing in FY14, and some retention 
testing has already been performed. 
1.0 INTRODUCTION

This report covers work performed for the NEPP program’s DDR device reliability task. The focus of this 
work is improving the reliability of DDR2 and DDR3 devices being considered for flight projects. This 
year’s effort expands on last year’s work on DDR2 devices. The goal of this work is to improve the 
reliability of devices selected for flight use by application of long-duration screening tests that can 
identify outlier or lower reliability devices before they are put into a flight system. 

Last year’s work, reported in [1], included details about the test approach that we established in the wake 
of detailed life testing performed earlier in the task. The updated test approach, which is continued here, 
seeks to gather as much characterization data as possible with low-cost, high-volume, and long-duration 
testing. In particular we focus on running limited datasheet parametric verification, performing standard 
tests to ensure nominal device operation, and testing for proper operation and data retention under stress 
environments. 

The approach is targeted at expanding on the expected reliability testing performed by the manufacturer. 
It is known that each device must be tested at the factory in order to utilize redundant cells to mask out 
regions of the device that do not meet minimum requirements. It is this fact that makes the population of 
parts perform within such a narrow window of operating parameters. That is, since the most problematic 
regions of a device are removed, and all (not just a small sample) devices must meet minimum operating 
requirements during initial fabrication, the overall reliability and population statistics are fairly good. 
However, it is still true that a fraction of all devices are expected to have problems, with the estimate 
being between 0.1 and 1% of devices exhibiting problems when deployed. Thus, our approach is focused 
on what can be done during acceptance testing to identify and weed out the worst performing devices. 

This report is laid out as follows. The background information that defines the expected device behavior 
and testing concepts is discussed in Section 2. This is followed by the test plan in Section 3. We then 
provide detailed information on the test hardware and development during FY13 in Section 4. Test results 
are presented in Section 5. This is followed by future work, such as how this task carries forward to 
DDR3 devices, in Section 6. The report concludes with Section 7. 
2.0 BACKGROUND

This NEPP task is focused on improving the reliability of DDR2 and DDR3 devices used for space 
missions. As such, it makes sense to review the recent information regarding problems with SDRAM-type
devices in space. In addition, field observations of DDR-class devices can indicate appropriate areas
for testing to improve reliability.

2.1 Failure Mechanisms 

As indicated in last year’s report [1], complementary metal oxide semiconductor (CMOS) devices have 
many complex failure mechanisms. These mechanisms can be tested for specifically using test structures. 
However, test structures valid for all of the types of devices built into a DDR integrated circuit (IC) would 
be a fairly large set, and would not necessarily be applicable for a commercial device purchased through 
normal vendors. It is prohibitively expensive to participate in research programs with DDR manufacturers,
as details of their process for building devices are not made available to users that
purchase fewer than millions of parts. As such, many different types of failure mechanisms may be
relevant to any given device, and this research task has very little ability to obtain relevant test data, 
outside of what is provided on the reliability of device lots from the manufacturer under sharing 
agreements. 

The standard failure mechanisms that can impact CMOS devices are electromigration, time-dependent 
dielectric breakdown, and hot-carrier injection. Each of these requires a specific set of reliability tests in 
order to explore. These tests require high and low temperature, maximum and minimum bias, and 
switching and constant-electric-field application. Because of the device- and lot-specific nature of 
failures, general reliability testing is of limited value for study unless it can provide general 
recommendations that can be used by flight projects. That is, we do not perform long-duration testing if it
is a type of testing that is already recommended for flight parts; we only perform long-duration testing if
it enhances the data that highlight DDR-specific failure mechanisms.

2.2 Basic Reliability Screening 

The reliability of DDR components is tested by the components’ manufacturers both before construction 
and by sampling of the units during and after construction. Knowledge of how the individual structures 
within the device may degrade, and how to test for the behaviors, allows the manufacturers to provide a 
highly reliable component. The information developed could be used to identify the devices with higher 
reliability than others, but there are two reasons why this is essentially meaningless for users. First, the
tolerances on devices are very tight due to the very large quantities of devices produced. Second, the
relevant information is not provided to users of the scale of flight projects.

For the reasons indicated above, screening of devices for basic reliability parameters and predicted 
degradation is out of scope of this task. Devices should, however, be screened for basic acceptance 
parameters, with standard screening tools. 

NASA programs are not without reasonable test efforts they can perform to increase the reliability of 
devices used. Rigorous operational testing can be performed on flight parts. This testing may take 
considerable time — which is one thing NASA programs have that manufacturers cannot afford on a 
device-by-device basis. In the test plan section below, we discuss the types of long duration testing that 
can be performed and what was done for reliability screening for outliers for this task. 

2.3 Flight Project Examples 

The goal of this task is to enable improved understanding of the reliability-based failure behaviors of 
DDR devices. Though there is limited information about DDR devices in flight projects, the behaviors of 
the earlier SDRAM devices are directly of interest. We seek to keep this task abreast of observed
anomalous behavior of flight components and to seek input on the effectiveness of test recommendations
made by this task.

The FY12 report discussed observation of bits with reduced reliability based on observations after launch. 
Due to limited pre-launch testing, the observations during flight were not able to be correlated to 
previously observed behavior. This was a key reason for changing the approach in this task to focus on 
using time to characterize behavior of devices rather than trying to identify devices that may degrade 
earlier than others. 

Due in part to recommendations stemming from this task, increased screening of flight parts has begun.
Testing using multiple data patterns has indicated weak bits in an upcoming flight project. This
behavior was not expected. The testing was performed on flight equipment that will not be swapped, and
the failures occurred at very high temperature; nevertheless, we will have the pre-flight data to compare to any
anomalies that occur during flight. We also have improved data on the general reliability of the devices
used on this project.

2.4 Field Observations 

In the FY12 report we highlighted the study of deployed DDR2 devices reported by Schroeder (Google)
[1-2]. This section provides a quick review of this information. In testing of Google’s computer fleet over
2.5 years, Schroeder found that 10% of dual inline memory modules (DIMMs) experience a correctable
error (CE) and about 1% of DIMMs experience an uncorrectable error. Since DIMMs generally have on 
the order of ten devices, this results in rates per component of 1% and 0.1% for CE and uncorrectable 
errors, respectively. We used this information to establish the need to examine at least hundreds of 
devices in order to have a reasonable probability of having a device that demonstrates reduced reliability 
and can be useful for evaluating the effectiveness of our recommended test approach. 
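
The sample-size argument above can be illustrated with a short calculation. This is a minimal sketch, assuming that devices fail independently and that the per-component rates quoted above (1% and 0.1%) apply directly to the test population; the function name and the sample sizes other than 144 are illustrative only.

```python
def prob_at_least_one(n_devices: int, per_device_rate: float) -> float:
    """Probability that at least one of n independent devices is affected.

    Assumes each device is affected independently with the same rate.
    """
    return 1.0 - (1.0 - per_device_rate) ** n_devices


if __name__ == "__main__":
    # Per-component rates taken from the field study summarized above.
    for rate in (0.001, 0.01):
        for n in (16, 144, 480):
            p = prob_at_least_one(n, rate)
            print(f"rate={rate:.3f}  devices={n:4d}  P(at least one)={p:.3f}")
```

Under these assumptions, a single 16-device DIMM is unlikely to contain an affected device, while a population of a few hundred devices has a reasonable chance of containing at least one.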

2.5 Measurable Reliability Data 

This work focuses on collecting data that extends the expected reliability of flight parts by testing devices 
against longer duration (but still relatively short, such as a few hundred hours) characterization of devices. 
The types of data we can collect are the following: adherence to datasheet parameters, exposure to non-
standard operating conditions, observance of device functionality against standard (but time-
consuming) industrial tests such as March X (see subsection 3.2), and ability of the dynamic random
access memory (DRAM) cells to store data.

Note that many datasheet parameters require sophisticated test equipment that is not available for this 
work. Hence we are somewhat limited when it comes to testing the datasheet parameters and rely on the 
built-in capabilities of our industrial acceptance tester. This tester can measure many of the parameters, 
but not all. For example, the required structure of the clock signal for the DDR2 memory includes many 
precise timings and signal sculpture requirements, of which only a small amount can be tested directly 
with our equipment. 

DRAM cells store data by charging up a storage capacitor, then periodically refreshing it, a procedure
that requires reading the storage capacitor to determine what charge is supposed to be stored on it. As long
as the capacitor has enough charge remaining, the circuit can reliably determine to what value it should be
refreshed. Measuring the ability of the DRAM cells to store data essentially comes down to determining
the leakage current of the individual cells as a function of various parameters. The primary parameter
that affects leakage current is the operating temperature of the devices, with the activation energy for the
leakage path (Ea) typically being such that the current doubles for every 10°C.
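
The doubling rule above gives a quick way to estimate how retention changes with temperature. The following is a rough sketch, assuming that retention time is inversely proportional to leakage current and that the 10°C doubling holds over the whole range; both are simplifying assumptions, and the function names are illustrative.

```python
def leakage_ratio(temp_c: float, ref_temp_c: float, doubling_step_c: float = 10.0) -> float:
    """Leakage current at temp_c relative to ref_temp_c.

    Uses the rule of thumb that leakage doubles for every doubling_step_c.
    """
    return 2.0 ** ((temp_c - ref_temp_c) / doubling_step_c)


def scaled_retention(retention_s_at_ref: float, temp_c: float, ref_temp_c: float) -> float:
    """Estimated retention time at temp_c, assuming retention scales as 1 / leakage."""
    return retention_s_at_ref / leakage_ratio(temp_c, ref_temp_c)


if __name__ == "__main__":
    # Leakage at 85 C relative to 40 C under the doubling rule.
    print(f"leakage ratio 40 C -> 85 C: {leakage_ratio(85.0, 40.0):.1f}x")
    # A cell that holds data for 1 s at 23 C, re-estimated at 33 C.
    print(f"retention at 33 C: {scaled_retention(1.0, 33.0, 23.0):.2f} s")
```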

We have also observed pattern sensitivity of DRAM cells. This is expected, as DRAM cells share bit and 
word lines with neighboring cells in various ways. The cells are also coupled to any other bits physically 
near. In order to examine pattern sensitivity, we use a set of different patterns as stimuli for the DRAM 
cells. 





3.0 TEST PLAN 

This section discusses the test plan designed and carried out for determining the general reliability of 
devices and for identifying outlier devices based on long-time-frame characterization that manufacturers
generally cannot do on a device-by-device basis. The basic test plan is the following: 

1. Determine general quality of devices through acceptance-type testing

2. Obtain data on operating range against most common variable parameters 

3. Obtain cell retention data for all cells with several different test patterns 

4. (Optional) If appropriate, use accelerated life testing to determine if out-of-family devices are 
susceptible to early failure 

3.1 Basic Verification of IDD (Acceptance Testing) 

Parametric measurements on DDR2 devices are important for assessment of reliability. Datasheets show a 
very large number of parameters that can be measured. This includes everything from input capacitance to 
the structure of the clock. However, as indicated earlier, the majority of these parameters cannot be 
measured with the resources available to this task in the quantity or detail required. We have determined 
that the most appropriate parametric studies that can be performed on DIMMs are to measure the standard 
datasheet IDD values, verify functionality across different data patterns, measure the time-dependent
nature of the storage cells, and attempt to correlate initial outliers with reduced overall life performance. 

In a DIMM, IDD values are combined from multiple devices. The IDD values are extracted using the
Eureka 2 tester. The measurement descriptions listed in Table 3.1-1 are those extracted by the Eureka 2
tester. Values in Table 3.1-1 represent the manufacturer’s specification for individual devices. Because
the sum of currents drawn from multiple devices may obscure a high IDD draw from a particular bad
device, this test is only a general way of assessing the overall behavior of the DIMM components and
may miss a high-current device. For flight we would recommend determining the IDD values for
individual components.
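
The masking effect described above can be shown with a small numerical sketch. The per-device currents below are hypothetical values chosen for illustration (they are not measurements from this task), and the DIMM-level pass criterion, the summed current compared against the summed per-device limit, is an assumption about how such a check might be framed rather than a description of the Eureka 2 tester. The 75 mA limit is the Hynix IDD0 specification.

```python
HYNIX_IDD0_LIMIT_MA = 75.0  # per-device IDD0 specification (Hynix)


def dimm_level_passes(device_currents_ma, per_device_limit_ma):
    """DIMM-level check: total current compared against the summed per-device limit."""
    return sum(device_currents_ma) <= per_device_limit_ma * len(device_currents_ma)


def devices_over_limit(device_currents_ma, per_device_limit_ma):
    """Device-level check: indices of devices whose own current exceeds the limit."""
    return [i for i, current in enumerate(device_currents_ma)
            if current > per_device_limit_ma]


# Hypothetical DIMM: 15 nominal devices at 55 mA and one device at 90 mA.
currents_ma = [55.0] * 15 + [90.0]

if __name__ == "__main__":
    print("DIMM-level pass:", dimm_level_passes(currents_ma, HYNIX_IDD0_LIMIT_MA))
    print("Devices over limit:", devices_over_limit(currents_ma, HYNIX_IDD0_LIMIT_MA))
```

In this example the DIMM total stays well under the summed limit even though one device is out of specification, which is why per-component measurement is preferable for flight parts.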


Table 3.1-1. IDD values measurable by Eureka 2 system and their specification for individual devices in DIMMs [3-5]. MT/s
refers to million transfers per second. CL refers to column address strobe (CAS) latency.


                                                               Specification (mA, at 800 MT/s, CL = 6)
IDD Item   Description                                         Micron   Samsung   Hynix
IDD0       Operating one bank active-precharge current             65        45      75
IDD1       Operating one bank active-read-precharge current        75        51      85
IDD2P      Precharge power-down current                             7        10      10
IDD2Q      Precharge quiet standby current                         24        20      32
IDD2N      Precharge standby current                               28        25      45
IDD3P      Active power-down current                               20        23      25
IDD3N      Active standby current                                  33        37      55
IDD4W      Operating burst write current                          125        72     170
IDD4R      Operating burst read current                           120        80     160
IDD5       Burst refresh current                                  145       105     170
IDD6       Self-refresh current                                     7        10      10
IDD7       Operating bank interleave read current                 210       160     230

We also used the Eureka 2 tester to provide information about the voltage and frequency space in which
devices function nominally. This is extracted by obtaining shmoo plots (see footnote 1) of the voltage and frequency
space with a given device functionality test, which determines if the device performs successfully.

Additional parametrics are specified in the manufacturer’s datasheet. These are standard operating 
voltages and currents: leakage currents on all pins, output driver strength, logic high and low values, edge 
timing, and other items. Note, however, we have determined that this type of general reliability study
would require significant resources, and is not believed to improve the information known beyond the
IDD measurement and shmoo scanning.

3.2 Shmoo Testing of Device Operating Area 

Shmoo testing was performed on the test DIMMs with voltage and operating frequency varied to 
determine the area in which DIMMs would perform reliably. The verification of proper operation was 
based on successfully passing a “March X” test on the entire DIMM at the given voltage and operating 
frequency. 

The March X test is a write and read test on a memory component. It consists of essentially four steps:

1. Write 0’s to the memory using an increasing address counter.

2. Read 0, then write 1 at each address using an increasing address counter. 

3. Read 1, then write 0 at each address using a decreasing address counter. 

4. Read 0 at each address using a decreasing address counter. 
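
The four steps above can be sketched in a few lines of code. This is an illustrative model only: the `SimpleMemory` class is a hypothetical one-bit-per-address stand-in with optional stuck-at faults, not the interface of the Eureka tester or of the DIMMs under test.

```python
class SimpleMemory:
    """Hypothetical one-bit-per-address memory with optional stuck-at cells."""

    def __init__(self, size, stuck_at=None):
        self._cells = [0] * size
        self._stuck = dict(stuck_at or {})  # address -> value the cell always reads

    def __len__(self):
        return len(self._cells)

    def __getitem__(self, addr):
        return self._stuck.get(addr, self._cells[addr])

    def __setitem__(self, addr, value):
        self._cells[addr] = value


def march_x(mem) -> bool:
    """Run the four March X steps; return True if every read matches."""
    n = len(mem)
    for addr in range(n):                # Step 1: write 0, increasing addresses
        mem[addr] = 0
    for addr in range(n):                # Step 2: read 0, write 1, increasing
        if mem[addr] != 0:
            return False
        mem[addr] = 1
    for addr in reversed(range(n)):      # Step 3: read 1, write 0, decreasing
        if mem[addr] != 1:
            return False
        mem[addr] = 0
    for addr in reversed(range(n)):      # Step 4: read 0, decreasing
        if mem[addr] != 0:
            return False
    return True


if __name__ == "__main__":
    print("good memory passes:", march_x(SimpleMemory(64)))
    print("stuck-at-1 cell passes:", march_x(SimpleMemory(64, {5: 1})))
```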

The voltage range selected for this work was from 1.5 V to 2.5 V, in increments of 0.1 V. Note that we 
believe there is on-chip regulation that limits the effectiveness of high voltage testing in actually 
achieving an altered state in the device. The DDR2 devices are specified to operate with a voltage in the 
range of 1.7 V to 1.9 V, and thus, this test is significantly outside of the normal operating voltage on both 
ends. 

The frequency range for the shmoo sweep is somewhat more problematic. The DIMMs are based on
400-MHz devices, but one set of DIMMs (Micron) does not meet this operational speed; it is
intentionally de-rated and unable to properly operate at 400 MHz. For this reason, the shmoo testing uses
one of three frequency ranges (only one is used for any given DIMM). The ranges are given below. All ranges
use frequency steps of 10 MHz.

1. Lower frequency range 1: 250-420 MHz (10 MHz steps) 

2. Lower frequency range 2: 300-420 MHz 

3. Higher frequency range 3: 300-450 MHz
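
A shmoo sweep over the voltage and frequency grid described above can be sketched as follows. The grid matches the stated voltage range and lower frequency range 1; the `toy_device` pass/fail model is entirely hypothetical and stands in for running a March X test on a real DIMM at each grid point.

```python
# Voltages are stored in tenths of a volt to avoid floating-point step errors.
VOLTAGES_DV = list(range(15, 26))       # 1.5 V to 2.5 V in 0.1 V steps
FREQS_MHZ = list(range(250, 421, 10))   # 250 MHz to 420 MHz in 10 MHz steps


def toy_device(voltage_v: float, freq_mhz: int) -> bool:
    """Hypothetical pass/fail model: maximum frequency rises with voltage."""
    return freq_mhz <= 200.0 + 110.0 * voltage_v


def shmoo(voltages_dv, freqs_mhz, passes):
    """Evaluate passes(voltage_v, freq_mhz) at every grid point."""
    return [[passes(v / 10.0, f) for f in freqs_mhz] for v in voltages_dv]


def render(grid, voltages_dv) -> str:
    """Text shmoo plot: highest voltage on top, 'P' for pass, '.' for fail."""
    lines = []
    for v, row in zip(reversed(voltages_dv), reversed(grid)):
        lines.append(f"{v / 10.0:.1f} V  " + "".join("P" if ok else "." for ok in row))
    return "\n".join(lines)


if __name__ == "__main__":
    grid = shmoo(VOLTAGES_DV, FREQS_MHZ, toy_device)
    print(render(grid, VOLTAGES_DV))
```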

3.3 Determination of Data Pattern Retention 

Our test plan includes significant effort to determine if devices under test (DUTs) are sensitive to the data
pattern stored in the cells. This is a focus because of observations in flight programs where
flaky bits (flaky is a jargon term referring to bits that sometimes, but not always, have difficulty holding
stored data) were observed after launch and insufficient initial characterization had been performed on the
devices. Consequently, there is incomplete knowledge of the quality of the questionable bits before
launch, and the team is forced to acknowledge that it is unknown whether observations in flight are of
existing or new conditions.

Footnote 1. A shmoo plot is a graphical display of the response of a component or system varying over a range of conditions
and inputs. It is often used to represent the results of testing complex electronic systems such as computers or
integrated circuits such as DRAMs or microprocessors. The plot usually shows the range of conditions in which the
device under test operates (in adherence with some remaining set of specifications).

3.3.1 Data Retention 

The key to our approach is to determine the characteristic storage time of the DRAM cells. This is done 
by slowing down the data refresh. The primary DRAM cell structure, shown in Figure 3.3.1-1, consists of a
storage capacitor that is isolated from the system by an access transistor. This is a constantly decaying
system (tending toward half the supply voltage, Vdd/2, on the storage capacitor). The cell is
read by activating the access transistor and observing the transient current pulse from the capacitor. If the
charge stored in the capacitor is large enough, then the circuit’s sense amplifier determines the correct
value and forces the bit line to the observed value. If the charge in the capacitor is too low, the sense
amplifier drives the line to its default state (which is dependent on many factors and can be the
opposite voltage on different cells).



The storage properties of the cells are not as simple as presented above because the cells are part of a 
meshwork of billions of cells and non-trivial connections. All of the attributes of each bit can contribute 
to its intrinsic leakage resistance. This includes the voltages present on neighboring cells, which can 
affect the local bias of the bit or word lines. Thus, the problem is that each cell has its own properties
(likely in a very tight population distribution), and its response depends on the charge it holds and the
charges present in its neighbors. The result is a range of cells, from those that lose their data quickly (~1 second at 23 °C) to
those that can hold their data for a day or two, as shown here and in [1]. In addition, if the pattern used
to test the cell corresponds to its intrinsic value when discharged (the value the sense amplifier assigns
when no charge pulse is observed), then the cell will never be observed to lose its stored data.
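
The masking effect described above, where a weak cell is invisible when the test pattern matches its discharged value, can be illustrated with a toy model. The `Cell` class and its two parameters are simplifying assumptions (a single retention time and a fixed default bit per cell, with no neighbor coupling); real cells are more complex, as noted above.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    retention_s: float  # time the cell holds its charge without refresh
    default_bit: int    # value the sense amplifier assigns once the charge is lost


def read_after_delay(cell: Cell, written_bit: int, delay_s: float) -> int:
    """Value read back after delay_s with refresh disabled."""
    return written_bit if delay_s <= cell.retention_s else cell.default_bit


def failing_cells(cells, pattern, delay_s):
    """Indices of cells whose read-back value differs from the written bit."""
    return {i for i, (cell, bit) in enumerate(zip(cells, pattern))
            if read_after_delay(cell, bit, delay_s) != bit}


# Two weak cells (0.5 s retention) with opposite default bits, and two strong cells.
cells = [Cell(0.5, 0), Cell(0.5, 1), Cell(100.0, 0), Cell(100.0, 1)]

if __name__ == "__main__":
    zeros = failing_cells(cells, [0, 0, 0, 0], delay_s=1.0)
    ones = failing_cells(cells, [1, 1, 1, 1], delay_s=1.0)
    print("fail with all 0s:", zeros)
    print("fail with all 1s:", ones)
    print("found by either pattern:", zeros | ones)
```

In this model each pattern exposes only one of the two weak cells, and only the combination of a pattern and its inverse finds both. This is one motivation for testing with inverted patterns.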

Through previous work we determined that use of single test patterns can result in imprinting of the 
pattern into the memory [6]. This was observed during temperature- and voltage-accelerated life testing 
and it is not known if the observation would carry over to normal use at nominal temperature and voltage. 

We used this understanding of the behavior of the DRAM cells to determine a multi-pattern, multi- 
temperature approach to testing cell retention time. 

3.3.2 Test Plan for Retention 

The test plan calls for determination of data retention of the DRAM cells in the DIMMs. For each set of 
test conditions we write a known pattern to the DIMM, wait the appropriate time delay (without 
refreshing the DIMM), then read out the data and determine the fraction of bits that have lost their data. 
The testing was conducted using the test matrix indicated in Tables 3.3-1 through 3.3-3: 
Table 3.3-1. Temperature portion of test matrix.

Condition      Test Temperature (°C)
Condition 1    40
Condition 2    85


Table 3.3-2. Data pattern portion of test matrix.

Condition      Test Data Pattern
Condition 1    All bits 0s
Condition 2    All bits 1s
Condition 3    DQ pattern = 0xA5 (A5)
Condition 4    Address-based (Addr-Based)
Condition 5    Address-based, inverted (Addr-Based#)
Condition 6    Pseudo-random pattern A (Random-A)
Condition 7    Pseudo-random pattern A, inverted (Random-A#)
Condition 8    Pseudo-random pattern B (Random-B)
Condition 9    Pseudo-random pattern B, inverted (Random-B#)


Table 3.3-3. Refresh delay portion of test matrix.

Condition                           Refresh Delay
Condition 1                         64 ms
Condition 2                         128 ms
Condition 3                         256 ms
Condition 4                         512 ms
Condition 5                         1.02 s
Condition 6                         2.04 s
Condition 7                         4.08 s
Condition 8                         8.19 s
Condition 9                         16.4 s
Condition 10                        32.8 s
Condition 11                        1 min 5.5 s
Condition 12                        2 min 11 s
Condition 13                        4 min 22 s
Condition 14                        8 min 45 s
Condition 15                        17 min 30 s
Condition 16                        35 min
Condition 17                        1 hr 10 min
Condition 18 (not always useful)    2 hr 20 min
Condition 19 (not always useful)    4 hr 40 min
Condition 20 (not always useful)    9 hr 20 min
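
The test matrix above and the per-condition measurement can be sketched as follows. This is a minimal illustration, assuming the refresh delays are exact doublings of 64 ms (the tabulated values are rounded) and that the failed fraction is a simple count of flipped bits; the function names are illustrative and do not describe the D2RT software.

```python
from itertools import product

TEMPERATURES_C = (40, 85)
PATTERNS = ("All-0", "All-1", "A5", "Addr-Based", "Addr-Based#",
            "Random-A", "Random-A#", "Random-B", "Random-B#")
BASE_DELAY_S = 0.064  # 64 ms, the first refresh delay condition


def refresh_delays(n_steps: int = 20):
    """Refresh delays in seconds, doubling from 64 ms at each step."""
    return [BASE_DELAY_S * 2 ** k for k in range(n_steps)]


def build_matrix():
    """Every (temperature, pattern, delay) combination in the test matrix."""
    return list(product(TEMPERATURES_C, PATTERNS, refresh_delays()))


def failed_fraction(written: bytes, read_back: bytes) -> float:
    """Fraction of bits that differ between the written and read-back data."""
    flipped = sum(bin(w ^ r).count("1") for w, r in zip(written, read_back))
    return flipped / (8 * len(written))


if __name__ == "__main__":
    print("conditions in full matrix:", len(build_matrix()))
    print("longest delay (hours):", refresh_delays()[-1] / 3600.0)
    print("example failed fraction:", failed_fraction(b"\xa5\xa5", b"\xa5\xa4"))
```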
3.3.3 Presentation of Data Pattern Retention Data

Because retention measurements are not necessarily standard, we present an example here. Figure 3.3-1 
below shows a typical retention measurement. The device is loaded with a pattern; then refresh is disabled 
for a specified period of time (x-axis); and after refresh is re-enabled the device is read and the number of 
bits that have lost data is recorded. The fraction of bits that are bad is used to determine the y-value of 
each point. Note that the final data point (at ~30,000 seconds) corresponds to about 8 hours.



Figure 3.3-1. Retention measurements for 9 DDR2 devices, taken at room temperature. The x-axis is time in seconds between 
full refresh cycles of the device. The y-axis is the fraction of bits that failed. 

For this year we increased the amount of data that is collected, and we had to develop an improved
graphing approach to display the data. Figure 3.3-2 is an example of a two-dimensional histogram
that is intended to show the data from all of the components on all the DIMMs in a data set at one time.
The left panel uses color for the height of the bins, while the right panel uses bar height; both
panels represent the same data. The time
shown is logarithmic, with 0 being 64 ms, and each following bin being approximately twice as long (so
that entry 17 is a little over an hour). The fraction of the device that has lost data is presented across
the front, and the height or color of each entry indicates how many of the devices fall in the given bin.
The data would be expected to only have one main clump at each timing, indicating one population. (Note
that towards the left side the bars for less than 1e-8 failure fraction are somewhat discrete and this
behavior should be ignored. Similarly, we show zero errors as the right edge at 1e-10, and these should be
largely ignored.) However, the data shown indicate two subpopulations. The left panel clearly shows a
single device with a few bad bits even when operating the device at the required refresh rate. It also shows
one DIMM’s worth of devices that did not function properly throughout the retention scan (the vertical
band on the right side). Note that when presented, these graphs will include the temperature of the scan
and the pattern used. These appear across the top of the left panel.

[Figure panel title: Temperature = 40, pattern Random A#]

Figure 3.3-2. Example of a two-dimensional histogram of all components from all DIMMs. The time is bottom to top (top chart)
and front to back (bottom chart). The fraction of bits remaining is across the front, and the height/color of the bars indicates the
number of devices in the given bin.
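
The binning behind this kind of two-dimensional histogram can be sketched as below. This is an assumed reconstruction for illustration, not the plotting code used for the figure: devices are counted by refresh-delay index and by decade of failed fraction, with zero-error results placed in the 1e-10 bin as described in the text. The sample measurements are made up.

```python
import math
from collections import Counter

ZERO_ERROR_DECADE = -10  # zero errors are shown at 1e-10


def bin_key(delay_index: int, failed_fraction: float):
    """Bin a measurement by delay index and decade of failed fraction."""
    if failed_fraction <= 0.0:
        return (delay_index, ZERO_ERROR_DECADE)
    decade = math.floor(math.log10(failed_fraction))
    return (delay_index, max(decade, ZERO_ERROR_DECADE))


def histogram_2d(measurements):
    """Count devices per bin; measurements are (device_id, delay_index, fraction)."""
    return Counter(bin_key(delay, frac) for _, delay, frac in measurements)


# Made-up measurements for three devices at two refresh delays.
sample = [
    ("d0", 0, 0.0), ("d1", 0, 0.0), ("d2", 0, 3e-6),
    ("d0", 5, 2e-4), ("d1", 5, 4e-4), ("d2", 5, 0.5),
]

if __name__ == "__main__":
    for (delay, decade), count in sorted(histogram_2d(sample).items()):
        print(f"delay index {delay:2d}  fraction decade 1e{decade:+d}  devices {count}")
```

In the made-up sample, device d2 falls outside the main clump at both delays, which is the kind of outlier the histogram is intended to expose.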
3.4 Accelerated Life Testing

This part of the test plan is optional because it may be of very limited value. We are interested in 
understanding how the components may fail. However, as was observed in FY11 testing, it is unlikely 
that any failure mechanism will be triggered in a nominal 1000-hour accelerated life test [1]. During the 
earlier testing the only failures were complete device non-functionality. 

In the approach developed for this test plan, we anticipated outlier devices may be identified. The outliers 
may be subject to accelerated life testing to observe if the devices should be removed from consideration 
for flight use due to reduced reliability. Thus far, we have not identified sufficient candidates from which 
to select devices for this type of testing. Upon completion of all retention scans, this type of testing will 
be pursued depending on the suitability of identified outliers. 

For accelerated life testing, the maximum datasheet parameters will be used: 1.9 V and 85°C (low
temperature is not used at this time due to testing difficulties involved in maintaining the DUT at -40°C
for multiple weeks). These were chosen because the earlier work indicated that operating the device
outside of the datasheet requirements produces non-interpretable results. Because the
acceleration parameters are not as large as would be desired (e.g., 125°C), it will probably be necessary to
go beyond 1000 hours to observe changes in operation and failures.
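
The effect of the lower stress temperature can be illustrated with the standard Arrhenius acceleration model. This is a rough sketch only: the activation energy (0.7 eV) and the use temperature (40°C) are assumed values chosen for illustration, not values determined in this task, and a single thermally activated mechanism is assumed.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5


def arrhenius_af(ea_ev: float, use_temp_c: float, stress_temp_c: float) -> float:
    """Arrhenius acceleration factor of stress temperature relative to use temperature."""
    t_use_k = use_temp_c + 273.15
    t_stress_k = stress_temp_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))


if __name__ == "__main__":
    ASSUMED_EA_EV = 0.7       # assumption, for illustration only
    ASSUMED_USE_TEMP_C = 40.0  # assumption, for illustration only
    for stress_c in (85.0, 125.0):
        af = arrhenius_af(ASSUMED_EA_EV, ASSUMED_USE_TEMP_C, stress_c)
        print(f"stress {stress_c:.0f} C: acceleration factor ~{af:.0f}, "
              f"1000 h of stress ~ {1000.0 * af:.0f} h of use")
```

Under these assumptions the acceleration at 85°C is roughly an order of magnitude lower than at 125°C, consistent with the expectation that tests longer than 1000 hours would be needed.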
4.0 TEST HARDWARE

This section discusses the key test hardware of this task and work done to improve the hardware systems 
during FY13. We present details on the DDR2 test devices chosen for our reliability work. This is 
followed by a brief review of the basic test hardware used. We then present information about hardware 
updates. And the section concludes with a brief discussion of the environmental chambers used for this 
testing. 

4.1 Target Devices 

From earlier work on this NEPP task, it was established that DIMMs are the most cost-effective way to 
obtain hundreds of components for testing. Samsung, Micron, and Hynix 2GB DIMMs were the focus of 
the FY13 work. The study DIMMs were produced using 16 1-Gb devices. Each device type was obtained 
in a set of 10 DIMMs, totaling 160 DDR2 devices for each manufacturer (they are two-rank unregistered 
DDR2 DIMMs). All test devices have 14 row bits, 10 column bits, and 3 bank bits (8 banks). Devices all 
have an 8-bit data word. Device details are given in Table 4.1-1. 

Table 4.1-1. Details of the DDR2 devices under test [3-5].

Manufacturer   Part Number              Device Photo   Number of Parts   Feature Size
Micron         MT47H128M8CF-25:H [3]    (photo)        160               50 nm
Samsung        K4T1G084QF [4]           (photo)        160               5x nm
Hynix          H5PS1G83EFR-S6C [5]      (photo)        160               5x nm
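
The device geometry quoted in Section 4.1 can be cross-checked with simple arithmetic. The sketch below uses only the numbers given in the text (14 row bits, 10 column bits, 3 bank bits, 8-bit data word, 16 devices per DIMM, 10 DIMMs per manufacturer); the constant and function names are illustrative.

```python
ROW_BITS = 14
COLUMN_BITS = 10
BANK_BITS = 3
WORD_BITS = 8
DEVICES_PER_DIMM = 16
DIMMS_PER_MANUFACTURER = 10


def device_capacity_bits() -> int:
    """Bits per device: one data word at every row/column/bank address."""
    return (1 << (ROW_BITS + COLUMN_BITS + BANK_BITS)) * WORD_BITS


def dimm_capacity_bytes() -> int:
    """Bytes per DIMM built from DEVICES_PER_DIMM devices."""
    return DEVICES_PER_DIMM * device_capacity_bits() // 8


if __name__ == "__main__":
    print("bits per device:", device_capacity_bits(), "(1 Gb)")
    print("bytes per DIMM:", dimm_capacity_bytes(), "(2 GB)")
    print("devices per manufacturer:", DEVICES_PER_DIMM * DIMMS_PER_MANUFACTURER)
```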
4.2 Base Test Hardware

We used hardware developed under the FY12 testing, expanded to enable testing of more devices. In this 
section we discuss the basic test hardware used for this test. This hardware comes from two specific test 
systems. The first is the Eureka II DDR2 tester. The second is a Xilinx Virtex 4 evaluation board, which 
is designated the Modular Digital Test System (MDTS) Prototype Board 3b (MPB3b). 

4.2.1 Eureka II 

The Eureka II tester has been used by this NEPP task since FY12 as a method to provide industry- 
standard test capability. The tester is shown in Figure 4.2-1. It consists of a test unit that connects by 
Universal Serial Bus (USB) to a test computer. The tester includes an interface that enables connection of 
different test heads that can support DDR2 and DDR3 DIMMs. 



Figure 4.2-1. Eureka II test system. The test head can be changed to enable testing of DDR2 or DDR3 devices. 

The Eureka II test system is used in the same way that it was used for the FY12 testing. That is, we 
configured the Eureka II test system to perform several standard parametric tests of test devices and to 
perform shmoo testing of device capability. This system enables us to ensure that standard capability is 
verified on the test devices and that other test systems used are in line with standard device operation.

4.2.2 MPB3b-Based Test System 

Because the general structure of the testing requires operation of many devices, we
have also built a device functional tester out of a prototyping board. This approach enables building
multiple test units and operating many devices in parallel. This base-board was introduced earlier in this
NEPP task, and in FY12 the system was modified to support
limited DIMM testing capability, with the test system shown in Figure 4.2-2.

4.2.3 General Test Hardware 

Figure 4.2-3 shows the hardware arrangement for the test systems described above in a multiple-DIMM 
configuration, with nine DIMMs operated simultaneously in environmental chambers. 
Figure 4.2-2. MPB3b-based test system as developed in FY12. This system was expanded to include multiple boards that will 
enable many devices to be tested simultaneously. 

Figure 4.2-3. The setup of the motherboards with environmental chambers is shown. 

The test system above is known as the DDR2 Reliability Tester (D2RT). The entire system consists of the 
motherboard (MPB3b), the mezzanine card (Mezzanine Card ‘C’, or MCC), the power units required to 
supply the MPB3b and MCC, the Opal Kelly USB communications card [7], and the operations 
computer. 

4.3 Test Hardware Development 

During FY13 test hardware development focused on two areas. First, a problem with power distribution 
that made the D2RT system unstable was resolved. Second, an update of the MCC was developed to 
improve the reliability of DIMMs operated in the environmental chambers. 

4.3.1 Upgraded Power Delivery 

The DDR2 instantiation we used for the D2RT requires the use of termination resistors. The total number 
of resistors required results in a static current draw on the termination power supply (VTT) of roughly 4 A. 
This current level is high enough to be taxing for most DUT power supplies. The voltage is supplied 
on-board by a power regulator. This regulator’s power-up behavior was unstable in the original design 
for the MCC, requiring the test operator to massage the power circuit to achieve good power supply 
behavior (once reliable operation was established, it was never observed to degrade). We decided to 
improve the overall performance of the power system by supplying the termination bias through a 
high-current local power regulator (isolated from the DUT) that also provides the input/output (I/O) 
voltage for the field-programmable gate array (FPGA) I/O banks communicating with the DUT. During 
this work, we identified the cause of the unstable turn-on behavior: high in-rush current being 
delivered by the termination regulator. The power unit can be seen in Figure 4.3-1. 



Figure 4.3-1. Power regulators developed to enable non-DUT power to be offloaded from the DUT power supply. 
4.3.2 Upgraded MCC 

The MCC developed for the initial verification of DIMM operation and initial gathering of retention data 
was found to have a few flaws, a couple of which limited the maximum clock speed. Furthermore, the 
repairs left fragile “haywires” that are easily damaged because the MCC is mounted through the 
environmental chamber doors. We collected all the flaws in the original MCC and developed a new 
revision. The layout of the MCC rev 1 board is shown below in Figure 4.3-2. 

Note that the revised MCC has a bayonet connector (BNC) jack to allow power to be delivered to the 
DUT alternately from the new power system described above or from a dedicated power supply for the 
DUT. In the majority of functional testing situations where the current is not monitored, this power 
supply approach will greatly simplify the test setup. 

4.4 Test Firmware Development and Implementation 

The updated MCC required modifications to the original DDR2 DIMM firmware used on the first version 
of the MCC. This work resulted in improved overall ability to debug the MCC revision A card. 
Development targeted the ability to observe interface signals during debugging, including key details 
about the positioning of signals relative to the DIMM clock. We also updated the design to enable the 
use of Xilinx debugging tools such as ChipScope. 
5.0 TEST RESULTS 

Three primary sets of data come out of the reliability testing based on the test plan. The first is the set of 
IDD measurements for all DIMMs. The second is the shmoo plots of each DIMM’s ability to successfully 
pass the March X test. And the third set of test results covers the retention scans. 

5.1 Test Summary 

This section briefly highlights the findings of the testing of DDR2 DIMMs for this year. We currently 
have data collected and analyzed for both IDD and shmoo testing of all test devices. For retention plots, 
we have completed analysis of the Hynix devices, but are still in-process on the Micron and Samsung 
retention scans. 

The testing is summarized below in Table 5.1-1 and Table 5.1-2. For the retention scans, we are ignoring 
the problems associated with port two of the test system, which would sometimes result in an entire scan 
for an entire DIMM being corrupt. 


Table 5.1-1. Summary of test results for all DIMMs for IDD and shmoo testing. 


Manufacturer  DUT    IDD Testing  Shmoo Testing
Hynix         H12_1  Nominal      Nominal
              H12_2  Nominal      Nominal
              H12_3  Nominal      Minor Difference
              H12_4  Nominal      Nominal
              H12_5  Nominal      Nominal
              H12_6  Nominal      Minor Difference
              H12_7  Nominal      Nominal
              H12_8  Nominal      Minor Difference
              H12_9  Nominal      Nominal
Micron        M12_1  Nominal      Nominal
              M12_2  Nominal      Nominal
              M12_3  Nominal      Nominal
              M12_4  Nominal      Nominal
              M12_5  Nominal      Nominal
              M12_6  Nominal      Nominal
              M12_7  Nominal      Nominal
              M12_8  Nominal      Nominal
              M12_9  Nominal      Nominal
Samsung       S12_1  Nominal      Nominal
              S12_2  Nominal      Nominal
              S12_3  Nominal      Nominal
              S12_4  Nominal      Nominal
              S12_5  Nominal      Nominal
              S12_6  Nominal      Nominal
              S12_7  Nominal      Nominal
              S12_8  Nominal      Nominal
              S12_9  Nominal      Nominal
Table 5.1-2. Summary of test results for Hynix DIMMs for retention scan testing. 

Test Condition  Result at 40°C                   Result at 85°C
All 1s          All devices nominal - no errors  All devices nominal - no errors
All 0s          All devices nominal - no errors  All devices nominal - no errors
A5 Pattern      All devices nominal - no errors  All devices nominal - no errors
Addr-Based      All devices nominal - no errors  Two devices show errors
Addr-Based#     All devices nominal - no errors  Two devices show errors
Random A        One device shows errors          One device shows errors
Random A#       One device shows errors          One device shows errors
Random B        One device shows errors          Two devices show errors
Random B#       One device shows errors          Two devices show errors


The minor differences in the shmoo testing of the Hynix devices are due to some failures of the March X 
test in the approximately 400 MHz bins, when the voltage was above the maximum operating voltage. For 
most devices there was an approximately 0.3 V high region where the DIMMs worked, but for three DIMMs 
this area was truncated. 

For the retention scans, the Addr-Based pattern showed significantly worse performance (on two devices 
only) at 85°C, compared to 40°C, but all other devices changed as expected. The Random A patterns 
produced one poorly operating device that appeared to have more problems at 85 °C, but generally the 
response was as expected. The Random B patterns were similar to Random A at 40°C and the general 
behavior stayed the same when going to 85°C (in contrast to the Random A behavior), except that for 
Random B, it appears a second device starts having errors at 85°C. 

5.2 IDD Scans 

The results of IDD scans taken at a clock rate of 400 MHz (data rate of 800 MT/s) are given in this 
section. Testing revealed that all the DIMMs function nominally, but all DIMMs show variation on the 
IDD4W and IDD4R tests. We do not have an explanation for this behavior, and we are not sure it 
indicates a real difference in devices. Instead, it is believed that a good current reading may be 
difficult to obtain when performing these tests, so the results may indicate the level of uncertainty 
in that measurement. 
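One simple way to quantify the variation noted above is the relative spread, (max - min) / mean, of each IDD test across the nine DIMMs. The sketch below applies it to a few rows of the Hynix measurements reported in Table 5.2-1. The choice of metric and the function name `relative_spread` are illustrative assumptions, not part of the test procedure used in this task.

```python
from statistics import mean

# Measured DIMM currents in mA for the nine Hynix DIMMs (H12_1..H12_9),
# copied from Table 5.2-1. Only four of the twelve tests are shown.
hynix_idd = {
    "IDD0":  [378, 376, 369, 375, 378, 375, 378, 375, 371],
    "IDD4W": [533, 425, 517, 414, 541, 531, 521, 519, 400],
    "IDD4R": [1310, 1281, 1287, 1173, 1308, 1146, 1259, 1269, 1322],
    "IDD5":  [847, 851, 833, 835, 847, 835, 841, 839, 830],
}

def relative_spread(values):
    """Return (max - min) / mean, a unitless measure of spread (illustrative metric)."""
    return (max(values) - min(values)) / mean(values)

spreads = {test: relative_spread(vals) for test, vals in hynix_idd.items()}

for test, spread in spreads.items():
    print(f"{test}: {100 * spread:.1f}% spread across DIMMs")
```

With these values the IDD4W and IDD4R spreads come out several times larger than those of IDD0 and IDD5, which matches the qualitative observation in the text.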

5.2.1 Hynix 

The IDD scans for the Hynix DIMMs are shown in Table 5.2-1. This table shows that the current for all 
operations is below that of eight devices performing the given IDD test and eight devices in the standby 
state. (Here standby is less than 10 mA/device.) Note that a fair amount of variation is observed in 
IDD4W and IDD4R. Also note that the operating currents for the Hynix parts are considerably higher 
than for the Micron and Samsung parts discussed later in this section. 
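The DIMM-level limits used for comparison can be reproduced from the per-device datasheet values. The sketch below assumes the limit is eight devices at the per-device maximum for the mode under test plus eight devices at the per-device standby maximum, with the standby value taken to be the IDD2P entry. That reading is inferred from the tabulated numbers rather than stated explicitly, and the function name `dimm_idd_limit` is hypothetical.

```python
def dimm_idd_limit(idd_test_ma, idd_standby_ma, active=8, standby=8):
    """DIMM-level limit in mA: active devices in the tested mode plus the rest in standby."""
    return active * idd_test_ma + standby * idd_standby_ma

# Per-device datasheet maximums in mA from the "1 Part" column of Table 5.2-1 (subset).
hynix_spec = {"IDD0": 75, "IDD1": 85, "IDD2P": 10, "IDD4W": 170, "IDD4R": 160, "IDD7": 230}

# Assumption: the standby current per device is the IDD2P value.
standby_ma = hynix_spec["IDD2P"]

limits = {name: dimm_idd_limit(value, standby_ma) for name, value in hynix_spec.items()}

def within_limit(measured_ma, limit_ma):
    """True if a measured DIMM current does not exceed the DIMM-level limit."""
    return measured_ma <= limit_ma

# Example: the IDD0 reading for H12_1 in Table 5.2-1 is 378 mA.
h12_1_idd0_ok = within_limit(378, limits["IDD0"])
```

The computed limits for these rows agree with the "8 Parts + stdby" column of the table, which supports the assumed formula.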
Table 5.2-1. The IDD performance of the 9 Hynix DIMMs. Note that all measurements are within the datasheet maximums for 
eight operating devices and eight standby devices (all measurements are in mA). 

       Datasheet Spec
Test   1 Part  8 Parts + stdby  H12_1  H12_2  H12_3  H12_4  H12_5  H12_6  H12_7  H12_8  H12_9
IDD0   75      680              378    376    369    375    378    375    378    375    371
IDD1   85      760              457    447    437    447    431    439    445    443    437
IDD2P  10      160              66     66     65     66     67     66     66     66     64
IDD2Q  32      336              176    175    172    202    178    175    177    173    171
IDD2N  45      440              174    172    170    172    175    172    174    170    168
IDD3P  25      280              62     64     60     64     64     62     64     62     60
IDD3N  55      520              544    541    533    535    542    533    544    539    529
IDD4W  170     1440             533    425    517    414    541    531    521    519    400
IDD4R  160     1360             1310   1281   1287   1173   1308   1146   1259   1269   1322
IDD5   170     1440             847    851    833    835    847    835    841    839    830
IDD6   10      160              37     37     37     37     37     36     37     37     35
IDD7   230     1920             427    441    429    427    439    433    431    437    421

5.2.2 Micron 

The IDD scans for the Micron DIMMs are shown in Table 5.2-2. This table shows that the current for 
all operations is below that of eight devices performing the given IDD test and eight devices in the 
standby state. (Here standby is less than 7 mA/device.) Note that a fair amount of variation is observed in 
IDD4W and IDD4R. 

Table 5.2-2. The IDD performance of the 9 Micron DIMMs. Note that all measurements are within the datasheet maximums for 
eight operating devices and eight standby devices (all measurements are in mA). 



       Datasheet Spec
Test   1 Part  8 Parts + stdby  M12_1  M12_2  M12_3  M12_4  M12_5  M12_6  M12_7  M12_8  M12_9
IDD0   65      576              248    240    234    234    234    240    234    236    234
IDD1   75      656              347    343    333    335    333    337    332    335    333
IDD2P  7       112              49     49     46     47     46     49     46     46     46
IDD2Q  24      248              132    124    120    121    120    123    120    121    121
IDD2N  28      280              131    123    119    121    120    122    120    121    119
IDD3P  20      216              46     46     44     44     42     46     44     42     42
IDD3N  33      320              386    359    353    357    355    361    353    357    355
IDD4W  125     1056             351    289    291    259    261    296    314    259    269
IDD4R  120     1016             328    507    343    408    498    281    479    308    476
IDD5   145     1216             732    720    712    722    707    710    705    710    707
IDD6   7       112              40     41     38     38     38     40     38     37     38
IDD7   210     1736             353    347    343    345    345    347    343    345    343

5.2.3 Samsung 

The IDD scans for the Samsung DIMMs are shown in Table 5.2-3. This table shows that the current for 
all operations is below that of eight devices performing the given IDD test and eight devices in the 
standby state. (Here standby is less than 10 mA/device.) Note that a fair amount of variation is observed 
in IDD4W and IDD4R. 

Table 5.2-3. The IDD performance of the 9 Samsung DIMMs. Note that all measurements are within the datasheet maximums 
for eight operating devices and eight standby devices (all measurements are in mA). 



       Datasheet Spec
Test   1 Part  8 Parts + stdby  S12_1  S12_2  S12_3  S12_4  S12_5  S12_6  S12_7  S12_8  S12_9
IDD0   45      440              208    210    210    210    212    214    212    210    210
IDD1   51      488              291    291    291    291    289    292    292    289    292
IDD2P  10      160              37     38     38     38     38     39     39     38     38
IDD2Q  20      240              136    136    137    138    137    139    139    137    138
IDD2N  25      280              135    132    137    137    137    138    138    136    137
IDD3P  23      264              35     35     35     35     37     37     35     35     35
IDD3N  37      376              308    312    312    310    310    213    316    310    310
IDD4W  72      656              361    302    369    367    398    367    371    296    373
IDD4R  80      720              474    560    306    302    304    503    550    455    511
IDD5   105     920              546    546    548    548    552    560    554    552    548
IDD6   10      160              39     40     40     40     40     41     41     40     40
IDD7   160     1360             285    285    285    287    283    287    287    283    287


5.3 Shmoo Plots 

In this section, the shmoo plots obtained for operating voltage versus frequency response to the March X 
test are presented. 
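For readers unfamiliar with the March X test, the following is a minimal sketch of the standard March X element sequence (write 0; ascending read 0, write 1; descending read 1, write 0; read 0) run against a simulated bit array. The `StuckAtZero` fault model and the 16-cell memory are illustrative assumptions. This is not the implementation used by the Eureka II tester, which operates on the real DIMM at each voltage and frequency point.

```python
def march_x(memory):
    """Run March X on a list-like memory of bits; return True if every read matches."""
    n = len(memory)
    for addr in range(n):              # M0: write 0 to every cell (order is irrelevant)
        memory[addr] = 0
    for addr in range(n):              # M1: ascending order, read 0 then write 1
        if memory[addr] != 0:
            return False
        memory[addr] = 1
    for addr in reversed(range(n)):    # M2: descending order, read 1 then write 0
        if memory[addr] != 1:
            return False
        memory[addr] = 0
    for addr in range(n):              # M3: read 0 from every cell (order is irrelevant)
        if memory[addr] != 0:
            return False
    return True

class StuckAtZero(list):
    """Toy fault model (hypothetical): one cell ignores writes and always holds 0."""

    def __init__(self, size, stuck_addr):
        super().__init__([0] * size)
        self.stuck_addr = stuck_addr

    def __setitem__(self, addr, value):
        # Writes to the stuck cell are forced to 0; all other writes behave normally.
        super().__setitem__(addr, 0 if addr == self.stuck_addr else value)

good_result = march_x([1] * 16)              # fault-free memory
bad_result = march_x(StuckAtZero(16, 5))     # cell 5 cannot hold a 1
```

In a shmoo plot, one pass/fail result of this kind is recorded for each voltage and frequency combination.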

5.3.1 Shmoo Plots for Hynix DIMMs 

Shmoo plots for the Hynix DIMMs are given in Figures 5.3-1 through 5.3-9. Most of the DIMMs show 
essentially the same operating area (green region). The exceptions are DIMMs 3, 6, and 8, which 
apparently have reduced functionality at off-datasheet voltage and high frequency. 
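The comparison of operating areas can also be made numerically by counting the passing cells in each shmoo grid and flagging DIMMs that fall below the population median. The sketch below uses a synthetic pass/fail rule, not measured data: the voltage and frequency grid, the `truncated_dimm` rule, and the DIMM names are all hypothetical and only mimic the kind of truncation described above.

```python
from statistics import median

def shmoo_grid(voltages, freqs_mhz, passes):
    """Evaluate a pass/fail callable at every (voltage, frequency) point."""
    return {(v, f): bool(passes(v, f)) for v in voltages for f in freqs_mhz}

def operating_area(grid):
    """Count the passing (green) cells of a shmoo grid."""
    return sum(grid.values())

def reduced_area_dimms(grids, tolerance=0):
    """Names of DIMMs whose passing-cell count is below the population median."""
    areas = {name: operating_area(grid) for name, grid in grids.items()}
    typical = median(areas.values())
    return sorted(name for name, area in areas.items() if area < typical - tolerance)

# Synthetic example grid and pass/fail rules (illustrative assumptions only).
voltages = [1.7, 1.8, 1.9, 2.0, 2.1]
freqs_mhz = [267, 333, 400]

def typical_dimm(v, f):
    return True

def truncated_dimm(v, f):
    # Fails only in the highest-voltage, highest-frequency corner.
    return not (f >= 400 and v >= 2.1)

grids = {
    f"DIMM_{i}": shmoo_grid(voltages, freqs_mhz,
                            truncated_dimm if i in (3, 6, 8) else typical_dimm)
    for i in range(1, 10)
}
flagged = reduced_area_dimms(grids)
```

A nonzero `tolerance` could be used to ignore differences of a few cells, such as the isolated pass/fail differences seen between otherwise similar DIMMs.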
Figure 5.3-1. Shmoo response of Hynix H12_1. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-2. Shmoo response of Hynix H12_2. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-3. Shmoo response of Hynix H12_3. Green indicates the DIMM passed the March X test at that voltage and 
frequency. This device appears to have reduced operating area at high frequency and voltage. 

Figure 5.3-4. Shmoo response of Hynix H12_4. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-5. Shmoo response of Hynix H12_5. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-6. Shmoo response of Hynix H12_6. Green indicates the DIMM passed the March X test at that voltage and 
frequency. This device appears to have reduced operating area at high frequency and voltage. 

Figure 5.3-7. Shmoo response of Hynix H12_7. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-8. Shmoo response of Hynix H12_8. Green indicates the DIMM passed the March X test at that voltage and 
frequency. This device appears to have reduced operating area at high frequency and voltage. 

Figure 5.3-9. Shmoo response of Hynix H12_9. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 


5.3.2 Shmoo Plots for Micron DIMMs 

Shmoo plots for the Micron DIMMs are given in Figures 5.3-10 through 5.3-18. Most of the DIMMs show 
essentially the same operating area (green region). Note that although the components on the DIMM are 
400-MHz devices, the operating area indicated in the shmoo plots clearly shows these DIMMs are not 
fully functional at 400 MHz. This is likely due to the design of the DIMM. The only real difference 
between these plots is the behavior at the 400 MHz bins. 



Figure 5.3-10. Shmoo response of Micron M12_1. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-11. Shmoo response of Micron M12_2. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-12. Shmoo response of Micron M12_3. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-13. Shmoo response of Micron M12_4. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-14. Shmoo response of Micron M12_5. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-15. Shmoo response of Micron M12_6. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-16. Shmoo response of Micron M12_7. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-17. Shmoo response of Micron M12_8. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-18. Shmoo response of Micron M12_9. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 
5.3.3 Shmoo Plots for Samsung DIMMs 

Shmoo plots for the Samsung DIMMs are given in Figures 5.3-19 through 5.3-27. Most of the DIMMs show 
essentially the same operating area (green region); only a few fail/pass boxes are different in any 
given plot. 



Figure 5.3-19. Shmoo response of Samsung S12_1. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-20. Shmoo response of Samsung S12_2. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-21. Shmoo response of Samsung S12_3. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-22. Shmoo response of Samsung S12_4. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-23. Shmoo response of Samsung S12_5. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-24. Shmoo response of Samsung S12_6. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-25. Shmoo response of Samsung S12_7. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-26. Shmoo response of Samsung S12_8. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 

Figure 5.3-27. Shmoo response of Samsung S12_9. Green indicates the DIMM passed the March X test at that voltage and 
frequency. 


5.4 Retention Scans 

This section provides the retention scan results for the Hynix DIMMs. The results are presented as a set of 
two-dimensional histograms for each test condition. The height or color of each histogram entry 
corresponds to how many of the DDR2 components had their response fall into the given data loss 
fraction at each retention value. See Section 3.3.3 for more information on the presentation of this data. 
We present here the results for the all 0s scan at 40 and 85°C in Figures 5.4-1 through 5.4-4. These 
figures show that the change from 40 to 85°C results in a shift of the histogram by four bins, with the first 
deviation from all devices fully working occurring in the 5-bin at 40°C, and in the 1-bin at 85°C. This is 
consistent with the expectation that the cells lose charge twice as fast for every 10°C increase (so data at 
85°C would be expected to move to shorter retention time by about four slots compared to 40°C). 
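The expected shift can be estimated directly from the doubling rule. The sketch below assumes that each retention bin represents a factor of two in retention time. That bin spacing is an assumption made here for illustration, since the binning is defined in Section 3.3.3 rather than in this section, and the function names are hypothetical.

```python
import math

def retention_speedup(t_low_c, t_high_c, doubling_step_c=10.0):
    """Factor by which charge loss speeds up, assuming it doubles every doubling_step_c."""
    return 2.0 ** ((t_high_c - t_low_c) / doubling_step_c)

def bin_shift(speedup, bin_ratio=2.0):
    """Number of bins the histogram moves, assuming bins spaced by a factor of bin_ratio."""
    return math.log(speedup, bin_ratio)

factor = retention_speedup(40, 85)   # 2 ** 4.5, roughly a 23x faster charge loss
shift = bin_shift(factor)            # about 4.5 bins under the factor-of-two assumption

print(f"speed-up factor: {factor:.1f}, expected shift: {shift:.1f} bins")
```

Under these assumptions the estimate of about 4.5 bins is in line with the observed shift of four bins.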

In Figure 5.4-5 we show the two-dimensional histograms for all of the remaining conditions. One 
common observation, starting in Figure 5.4-5, upper right, is a band of eight or sixteen components (blue 
and green colors) where one of the DIMMs did not communicate properly during testing. This behavior 
only occurred on DIMMs connected to the mezzanine card connected to port two of the test system and is 
believed to be related to the way these cards had to be rewired to work correctly (leaving port two to be 
sometimes unreliable). 
Figure 5.4-1. Resulting two-dimensional histogram of 144 Hynix DDR2 devices at 40°C, using an all 0s pattern. 

Figure 5.4-2. Three-dimensional representation of the two-dimensional (2-d) histogram in Figure 5.4-1. 

Figure 5.4-3. Two-dimensional histogram for the all 0s scan at 85°C. Note that the devices are on the borderline even close to 
the datasheet specification of 64 ms (the 0-position on the vertical axis). 

Figure 5.4-4. Three-dimensional representation of the 2-d histogram in Figure 5.4-3. 

Figure 5.4-5. Two-dimensional histograms for Hynix components for the all 1s (top left), A5 (top right), Address-Based (bottom 
left), and Address-Based# (bottom right) patterns. Note that the A5 scan is the first one that shows a problem where one of the 
DIMMs had problems on one of the test systems. 

Figure 5.4-6 and Figure 5.4-7 show the first scan with a random pattern used. This scan shows that one 
device has around 100 bits that have trouble storing the data pattern, which shows up as the bin with a 
count of “1” between 10^-8 and 10^-7. This is one out of 144 devices, which is about the level at which 
we expected to see outlier devices (though we could not have predicted this exact behavior). Note that 
this device did pass the 0s, 1s, address-based, and A5 pattern scans with no problems (and, as shown 
later, will pass the 0s, 1s, and A5 scans at 85°C, which was tested after the 40°C scans). Further, this 
device passed the IDD tests and the march tests conducted in the more standard industrial testing 
discussed in Sections 5.2 and 5.3. The green bin in the time 0 scan indicates a problem where sometimes 
the first scan of a retention scan results in a single read-or-write error resulting in about 32 bad bits. 
This was seen on multiple devices that then operated with no errors during later scans, so it is believed 
to be a test artifact. The strip corresponding to about 50% of bits being bad corresponds to the port 2 
problem discussed above. Figure 5.4-8 shows this random pattern impact on the Random A#, Random B, 
and Random B# patterns. 
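The placement of this device in the histogram can be checked with a short calculation. The sketch below assumes each DDR2 component holds 2^30 bits (1 Gb), a density inferred from the part numbers rather than stated in this section, and assumes the data loss axis is binned by decade. The function names are hypothetical.

```python
import math

DEVICE_BITS = 2 ** 30  # assumed 1 Gb density per DDR2 component

def loss_fraction(bad_bits, device_bits=DEVICE_BITS):
    """Fraction of a device's bits that failed to hold the stored pattern."""
    return bad_bits / device_bits

def decade_bin(fraction):
    """Return the (lower, upper) powers of ten that bracket a fraction."""
    exponent = math.floor(math.log10(fraction))
    return 10.0 ** exponent, 10.0 ** (exponent + 1)

outlier = loss_fraction(100)    # the device with around 100 bad bits
artifact = loss_fraction(32)    # the single bad burst seen in some time 0 scans

outlier_bin = decade_bin(outlier)
artifact_bin = decade_bin(artifact)
```

Under these assumptions both cases land in the same decade, between 10^-8 and 10^-7, which is why the time 0 artifact and the outlier device appear at a similar height on the data loss axis.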


Figure 5.4-6. The first random scan at 40°C shows a behavior seen in all random scans. Here there is one device that has bits 
in error. Note that the green point in the 0-bin is likely due to a switchover error where the first scan with a new pattern has a 
single read-or-write error that results in a burst access with bad data. 

Figure 5.4-7. Three-dimensional representation of data from Figure 5.4-6. Note that the time 0 scan is affecting multiple devices 
and doesn't repeat after the initial time 0 scan (the longer retention scans are done after the time 0 scan). 

Figure 5.4-8. The two-dimensional histograms for the Random A# (inverted) (upper left), Random B (upper right), and Random 
B# (lower). 

Figure 5.4-9 shows the nominal behavior of DIMMs when operated at high temperature (85°C), when 
tested against the simple patterns (0s, 1s, and A5). Figure 5.4-10 shows that at 85°C two devices are 
exhibiting problems, even with the relatively simple address-based pattern (and its inverse). 

Figure 5.4-11 rounds out the retention scans, showing that the random patterns continue to 
highlight one part with problems. There may also be a second device (more clearly seen in the lower left 
panel where two bins have one device each) exhibiting a small number of bad bits. These plots show that 
the 40°C testing is very indicative of the high temperature response here. The general population moves 
as expected. 
Figure 5.4-9. Two-dimensional histograms for all 0s (upper left), all 1s (upper right), and the A5 pattern (lower). 

Figure 5.4-10. Two-dimensional histograms for Addr-Based and Addr-Based# patterns. Note that a couple of devices have bits in 
error at this temperature. Note also that the poor-performing DIMMs in port 2 are in between working and not working. 

Figure 5.4-11. Two-dimensional histograms for Random A (upper left), Random A# (upper right), Random B (lower left), and 
Random B# (lower right) are shown. These show the same behavior as for 40°C, except that the Random A tests are slightly 
worse than at 40°C. These plots also indicate there may be a second device exhibiting problems as well. 
6.0 FUTURE WORK 

This work is expected to continue into FY14. The focus of the work will be twofold. First will be the 
finalization of DDR2 screening. Second will be expansion of the target devices to include DDR3. When 
combined, these will provide significantly increased value to the data collection for DDR2 and DDR3, 
which will provide useful information for flight project use of either of these parts. The use of DDR2 
devices in selected programs suggests that the higher speed and lower cost DDR3 parts will likely be used 
in the near future. Thus, it is important to make the transition to these devices. 

The DDR2 work will entail completing retention measurements of Micron and Samsung parts. As 
indicated above, this will include 40°C and 85°C retention measurements on nine DIMMs. The total 
number of components in the test plan is 144 DDR2 devices from each manufacturer. Given the updated 
DDR2 hardware, it is believed the retention scans can be completed with six or more DIMMs run in 
parallel. 

The DDR3 work in FY14 is planned to include all necessary hardware development to enable 
unregistered DDR3 DIMM testing. We also expect to have initial reliability test data in place during 
FY14 and a robust test schedule to accommodate more than 15 DIMMs (single or double-rank) for IDD 
screening, shmoo testing, and retention scans to be completed within 4 months of starting. 
7.0 CONCLUSIONS 

The FY13 DDR2 reliability NEPP task has successfully performed testing of DDR2 components from 
three manufacturers. The test approach developed has started to show significant potential benefit to flight 
project users. The approach of identifying outliers and determining the quality of devices against pattern 
sensitivity can improve the overall quality of deployed parts. This work is based largely on reliability 
evidence available from a few key resources and from experience with these parts on flight projects. 
Continued reliability testing, and moving to newer devices such as DDR3, will keep this work relevant 
for current and future projects. 

The test approach used is the following. We couple initial acceptance testing using industrial testers with 
custom long-duration characterization of devices. The approach expects the manufacturing processes to 
ensure that the majority of parts (greater than 98%) are part of the principal population and will be 
essentially indistinguishable over many years of flight use. By performing our additional testing we can 
ensure that devices selected for flight use will be part of the principal population, and that population will 
have no indicators of early failure (within the application of our test results). 
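As a rough illustration of what this expectation implies for a lot of 144 components, the sketch below computes the chance of seeing at least one outlier, assuming devices are independent and using a hypothetical outlier fraction of 2% (the upper bound implied by a principal population of greater than 98%). The actual outlier fraction is not known, so the numbers are indicative only.

```python
def prob_at_least_one(n_devices, outlier_fraction):
    """Probability that a lot of n independent devices contains at least one outlier."""
    return 1.0 - (1.0 - outlier_fraction) ** n_devices

N_DEVICES = 144            # nine DIMMs with sixteen components each
ASSUMED_FRACTION = 0.02    # hypothetical upper bound on the outlier fraction

p_any = prob_at_least_one(N_DEVICES, ASSUMED_FRACTION)
expected_outliers = N_DEVICES * ASSUMED_FRACTION

print(f"P(at least one outlier) = {p_any:.3f}, expected outliers = {expected_outliers:.2f}")
```

At this assumed fraction a 144-component lot would very likely contain at least one outlier, so finding one or two is not surprising.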

The testing of DDR2 devices for FY13 included initial IDD and shmoo screening of nine DIMMs, each 
with sixteen DDR2 components, from each of three manufacturers. One manufacturer’s DIMMs, Hynix, 
were also tested for pattern and temperature sensitivity of DRAM cell retention. The IDD testing showed 
that all devices were similar, with the only significant variation being in the IDD4R and IDD4W 
currents, which appears for all manufacturers and is believed to be a byproduct of the tester. The 
shmoo testing showed that all DIMMs work well at the standard operating speeds of 333 MHz and 400 MHz 
(with the exception that the Micron DIMMs were not configured to support 400 MHz operation). The cell 
retention measurements for the Hynix DIMMs highlighted a few key observations. First, we found one 
outlier device that had a few bad bits when tested with random patterns (at both test temperatures). 
We also found outlier devices that had trouble with address-based patterns at 85°C. The final 
observation from the retention scans is that some DIMMs have trouble operationally, for various 
patterns, which is likely due to the reliability of the test boards and is the primary reason why an 
updated board was developed this year. 

Because of the nature of the outlier behavior (bad bits even at the fastest refresh rate), we do not believe it 
is appropriate to perform life testing on the identified outlier devices found thus far. That is, the candidate 
devices would have been rejected during screening since they cannot reliably store data, and therefore any 
reduced reliability is irrelevant for flight projects. 

In the event that outliers are found that do not impact the operation of a flight system, it still makes 
sense to plan life testing for these devices. Life testing can show whether devices that are not in the 
main population actually have reduced reliability when deployed. 

DDR2 devices have been the primary focus recently because of known flight project use of the devices. 
There was a break between SDRAM and DDR2 where the original DDR devices were not used, 
presumably because the power draw of the devices was simply too high to be used. Conversely, DDR3 is 
already in the development plans for flight projects, and thus, we will be working on similar reliability 
testing of DDR3 devices in the near future. 
9.0 APPENDIX A. ACRONYMS AND ABBREVIATIONS 

2-d     two dimensional 
3-d     three dimensional 
ADC     address, data, and control 
Addr    address 
BNC     bayonet connector 
CAS     column address select - a control signal of the DDR2 interface 
CE      correctable error 
CL      CAS Latency 
CMOS    complementary metal oxide semiconductor 
D2RT    DDR2 Reliability Tester 
DDD     displacement damage dose 
DQ      memory data pin 
DDR     Double Data Rate 
DDR2    Double Data Rate Class 2 (DDR3 etc.) 
DIMM    dual inline memory module 
DRAM    dynamic random access memory 
DQ      data line where Q is 0-7 
DUT     device under test 
Ea      activation energy for the leakage path 
FBGA    fine ball grid array 
FPGA    field-programmable gate array 
FSM     finite-state machine 
FY      fiscal year 
GSFC    Goddard Space Flight Center 
IDD     supply current flowing to the Vdd pins 
IDD(q)  Idd drawn by device while in operating mode q. 
IC      integrated circuit 
I/O     input/output 
JPL     Jet Propulsion Laboratory 
LCDT    low-cost digital tester 
MCA     mezzanine card A 
MCB     mezzanine card B 
MCC     mezzanine card C 
MDTS    Modular Digital Test System 
MPB3b   Modular Digital Test System (MDTS) Prototype Board 3b 
MT/s    Million/Mega Transfers per Second 
NEPP    NASA Electronic Parts and Packaging 
SDRAM   synchronous dynamic random access memory 
SSTL    Stub Series Terminated Logic 
TID     total ionizing dose 
TBC     to be confirmed 
TBD     to be determined 
USB     Universal Serial Bus 
VTT     termination power supply 